pacman::p_load(tmap, sf, sfdep, dplyr, mapview, ggpubr, tidyverse)Take-Home Exercise 1
Setting the Scene
As urban systems digitize, encompassing buses, taxis, transit, utilities, and roads, the resulting datasets offer a comprehensive framework for tracing movement over both space and time. This surge in pervasive technologies like GPS and RFID integrated into vehicles amplifies this capability. For instance, smart cards and GPS devices in public buses collect extensive data on routes and ridership, potentially unveiling intricate patterns that divulge vital insights into measured phenomena. Analyzing and comparing these patterns could yield profound understanding of human behavior and movement within cities. These insights hold promise for enhancing urban management and providing crucial information to both public and private urban transport services for informed decision-making and competitive edge.
Despite the potential, the real-world application of this abundant location-aware data tends to be limited to basic tracking and mapping using GIS applications. This limitation stems from conventional GIS tools lacking the necessary functionality to effectively analyze and model spatial and spatio-temporal data.
Objectives
Exploratory Spatial Data Analysis (ESDA) hold tremendous potential to address complex problems facing society. Applying Geovisualisation and Analysis and Local Indicators of Spatial Association (GLISA) to undercover the spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.
Getting started
Loading all the package that could be use for this into R.
Geospatial Data
Import the BusStop data, this data included the information of all the bus stops located in Singapore. This data is provided by LTA DATA MALL.
BusStop <- st_read(dsn = "data/geospatial",
layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`/Users/pirapatchaiya/Library/Mobile Documents/com~apple~CloudDocs/SMU MITB NOV TERM/Applied Geospatial Analytics/ISSS624/ISSS624/Take-Home_Ex/Take-Home_Ex1/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
tmap_mode("view")Plotting the raw data, this plot show all the BusStops location
tm_shape(BusStop) +
tm_dots()Hexagonal Grid Creation
The code chunk below uses the sf package to create a grid of polygons around bus stops with a specific cell size and then converts this grid into an sf object while assigning unique grid IDs to each cell.
grid = st_make_grid(BusStop, c(500), what = "polygons", square = FALSE)
# sf and add grid ID
grid_sf = st_sf(grid) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(grid)))This code chunk below determines how many bus stops intersect with each grid cell. Then, it filters out grid cells that don’t contain any bus stops, keeping only those cells with at least one bus stop inside.
grid_sf$n_colli = lengths(st_intersects(grid_sf, BusStop))
# remove grid without value of 0 (i.e. no points in side that grid)
grid_count = filter(grid_sf,n_colli > 0 )tmap_mode("view")Plotting hexagon map
tm_shape(grid_count) +
tm_fill(
col = "n_colli",
palette = "Blues",
style = "cont",
title = "Number of collisions",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"Number of collisions: " = "n_colli"
),
popup.format = list(
n_colli = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)Next, as we can see there are some busstop that is outside Singapore. In this case it will be exclude.
Removing the Bus Stops outside Singapore using the code below
grid_count_rm <- grid_count %>%
filter(!grid_id == 1767,
!grid_id == 2073,
!grid_id == 2135,
!grid_id == 2104)New Plot after removed the Bus Stops outside Singapore
tm_shape(grid_count_rm) +
tm_fill(
col = "n_colli",
palette = "Blues",
style = "cont",
title = "Number of collisions",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"Number of collisions: " = "n_colli"
),
popup.format = list(
n_colli = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)Aspatial Data
Import the aspatial data into R, the data that will be use here is the “origin_destination_bus_202310”. This data is originally from the LTA DATA MALL.
busTrips <- read_csv("data/aspatial/origin_destination_bus_202310.csv")# busTrips <- read_csv("data/aspatial/origin_destination_bus_202308.csv")
# busTrips <- read_csv("data/aspatial/origin_destination_bus_202309.csv")
busTrips$ORIGIN_PT_CODE <- as.factor(busTrips$ORIGIN_PT_CODE)
busTrips$DESTINATION_PT_CODE <- as.factor(busTrips$DESTINATION_PT_CODE)Data Wrangling
Aspatial Data Wrangling
Calculating bus trip in weekday and morning peak hour
busTripsDayMorning <- busTrips %>%
filter(DAY_TYPE == "WEEKDAY",
TIME_PER_HOUR >= 6,
TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(WeekdayMorningTrips = sum(TOTAL_TRIPS))Calculating bus trip in weekday and afternoon peak hour
busTripsDayAfternoon <- busTrips %>%
filter(DAY_TYPE == "WEEKDAY",
TIME_PER_HOUR >= 17,
TIME_PER_HOUR <= 20) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(WeekdayAfternoonTrips = sum(TOTAL_TRIPS))Calculating bus trip in weekend and morning peak hour
busTripsEndMorning <- busTrips %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY",
TIME_PER_HOUR >= 11,
TIME_PER_HOUR <= 14) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(WeekendMorningTrips = sum(TOTAL_TRIPS))Calculating bus trip in weekend and evening peak hour
busTripsEndEvening <- busTrips %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY",
TIME_PER_HOUR >= 16,
TIME_PER_HOUR <= 19) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(WeekendEveningTrips = sum(TOTAL_TRIPS))Joining all the Peak Trips
BusTrips_comb <- busTripsDayMorning %>%
left_join(busTripsDayAfternoon) %>%
left_join(busTripsEndMorning) %>%
left_join(busTripsEndEvening)Geospatial Data Wrangling
grid_bus <- st_join(grid_count_rm,BusStop,join = st_contains) %>%
unnest() %>%
select(grid_id,BUS_STOP_N)grid_bus$BUS_STOP_N <- as.factor(grid_bus$BUS_STOP_N)Joining aspatial and geospatial
Joining both aspatial and geospatail data together
Trips <- left_join(BusTrips_comb,grid_bus,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
group_by(grid_id)%>%
summarise(WeekdayMorningTrips = sum(WeekdayMorningTrips),
WeekdayAfternoonTrips = sum(WeekdayAfternoonTrips),
WeekendMorningTrips = sum(WeekendMorningTrips),
WeekendEveningTrips = sum(WeekendEveningTrips))Trips <- left_join(grid_count_rm,Trips) %>%
mutate (Total_Trips = WeekdayMorningTrips+WeekdayAfternoonTrips+WeekendMorningTrips +WeekendEveningTrips) %>%
rename (n_bus = n_colli)Trip amount per bus stop, to determine whether the quantity of bus stations or whether it is indeed crowded.
TripsPerBusStop <- Trips %>%
mutate (WeekdayMorningTrips = WeekdayMorningTrips/n_bus,
WeekdayAfternoonTrips = WeekdayAfternoonTrips/n_bus,
WeekendMorningTrips = WeekendMorningTrips/n_bus,
WeekendEveningTrips = WeekendEveningTrips/n_bus)TripsLog <- Trips %>%
mutate (WeekdayMorningTrips = log(WeekdayMorningTrips),
WeekdayAfternoonTrips = log(WeekdayAfternoonTrips),
WeekendMorningTrips = log(WeekendMorningTrips),
WeekendEveningTrips = log(WeekendEveningTrips))Visualising
tmap_mode("view")The map below displays the total bus trip for each bus station of origin.
tm_shape(Trips) +
tm_fill(
col = "Total_Trips",
palette = "Blues",
style = "quantile",
title = "Total Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.format = list(
Total_Trips = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)It is evident from the above plot of total trips that bus travel is dispersed throughout the nation. On the other hand, many clusters are visible.
Total Trips
weekday_morning <- tm_shape(Trips) +
tm_fill(
col = "WeekdayMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Morning Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekday_afternoon <- tm_shape(Trips) +
tm_fill(
col = "WeekdayAfternoonTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Afternoon Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_morning <- tm_shape(Trips) +
tm_fill(
col = "WeekendMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Morning Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_evening <- tm_shape(Trips) +
tm_fill(
col = "WeekendEveningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Evening Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)tmap_mode("plot")tmap_arrange(weekday_morning, weekday_afternoon, weekend_morning, weekend_evening, ncol=2, nrow=2)
The segmented trips in the map above are broken down into four sector: weekend morning, weekend afternoon, and weekend morning. The distinctions between them are negligible. On the east side, there is a noticeable difference, though. On weekends and weekdays, the east side quantile is lower in the evening or afternoon than it is in the morning. Additionally, it is evident in the central that in the afternoon and evening, their quantile is higher.
Total Trips per Bus Stop
weekday_morning_per_stop <- tm_shape(TripsPerBusStop) +
tm_fill(
col = "WeekdayMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Morning Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekday_afternoon_per_stop <- tm_shape(TripsPerBusStop) +
tm_fill(
col = "WeekdayAfternoonTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Afternoon Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_morning_per_stop <- tm_shape(TripsPerBusStop) +
tm_fill(
col = "WeekendMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Morning Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_evening_per_stop <- tm_shape(TripsPerBusStop) +
tm_fill(
col = "WeekendEveningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Evening Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)tmap_mode("plot")tmap_arrange(weekday_morning_per_stop, weekday_afternoon_per_stop,
weekend_morning_per_stop, weekend_evening_per_stop,
ncol=2, nrow=2) 
The diagram above demonstrates how the overall number of trips varies depending on the day and time of day. Each bus station in a hexagon has a different amount of total trips within it. Still, the outcomes are strikingly identical to the last one.
Log value of trips
weekday_morning_log <- tm_shape(TripsLog) +
tm_fill(
col = "WeekdayMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Morning Trips per Bus Stop",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekday_afternoon_log <- tm_shape(TripsLog) +
tm_fill(
col = "WeekdayAfternoonTrips",
palette = "Blues",
style = "quantile",
title = "Weekday Afternoon Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_morning_log <- tm_shape(TripsLog) +
tm_fill(
col = "WeekendMorningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Morning Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)weekend_evening_log <- tm_shape(TripsLog) +
tm_fill(
col = "WeekendEveningTrips",
palette = "Blues",
style = "quantile",
title = "Weekend Evening Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.6
) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_legend(scale = 0.5)tmap_mode("plot")tmap_arrange(weekday_morning_log, weekday_afternoon_log,
weekend_morning_log, weekend_evening_log, ncol=2, nrow=2)
When the number is logarithmic, the value of the total number of trips for each hexagon is displayed in the figure above. Additionally, there aren’t many differences between them.
Local Indicators of Spatial Association (LISA) Analysis
Spatial Weight
wm_idw <- Trips %>%
mutate(nb = st_dist_band(grid),
wts = st_inverse_distance(nb, grid,
scale = 1,
alpha = 1),
.before = 1) %>%
mutate(grid_id = 1:length(Trips$grid_id))Local Moran
Weekday Morning
lisa_WDM <- wm_idw %>%
mutate(local_moran = local_moran(
WeekdayMorningTrips, nb, wts, nsim = 99, na.action=na.pass),
.before = 1) %>%
unnest(local_moran)ii_val_moran_WDM <- tm_shape(lisa_WDM) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of Trip",
main.title.size = 0.8)p_val_moran_WDM <- tm_shape(lisa_WDM) +
tm_fill("p_ii_sim") +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)tmap_arrange(ii_val_moran_WDM, p_val_moran_WDM, ncol = 2)
lisa_sig_WDM <- lisa_WDM %>%
filter(p_ii_sim < 0.05)
pal <- RColorBrewer::brewer.pal("Set1", n = 5)
weekday_morning_localmoran <- tm_shape(lisa_WDM) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sig_WDM) +
tm_fill("mean", palette = pal , title = "Weekday Morning Trips") +
tm_borders(alpha = 0.4)Weekday Afternoon
lisa_WDA <- wm_idw %>%
mutate(local_moran = local_moran(
WeekdayAfternoonTrips, nb, wts, nsim = 99, na.action=na.pass),
.before = 1) %>%
unnest(local_moran)ii_val_moran_WDA <- tm_shape(lisa_WDA) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of Trip",
main.title.size = 0.8)p_val_moran_WDA <- tm_shape(lisa_WDA) +
tm_fill("p_ii_sim") +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)tmap_arrange(ii_val_moran_WDA, p_val_moran_WDA, ncol = 2)
lisa_sig_WDA <- lisa_WDA %>%
filter(p_ii_sim < 0.05)
pal <- RColorBrewer::brewer.pal("Set1", n = 5)
weekday_afternoon_localmoran <- tm_shape(lisa_sig_WDA) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sig_WDA) +
tm_fill("mean", palette = pal , title = "Weekday Afternoon Trips") +
tm_borders(alpha = 0.4)Weekend Morning
lisa_WEM <- wm_idw %>%
mutate(local_moran = local_moran(
WeekendMorningTrips, nb, wts, nsim = 99, na.action=na.pass),
.before = 1) %>%
unnest(local_moran)ii_val_moran_WEM <- tm_shape(lisa_WEM) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of Trip",
main.title.size = 0.8)p_val_moran_WEM <- tm_shape(lisa_WEM) +
tm_fill("p_ii_sim") +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)tmap_arrange(ii_val_moran_WEM, p_val_moran_WEM, ncol = 2)
lisa_sig_WEM <- lisa_WEM %>%
filter(p_ii_sim < 0.05)
pal <- RColorBrewer::brewer.pal("Set1", n = 5)
weekend_morning_localmoran <- tm_shape(lisa_WEM) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sig_WEM) +
tm_fill("mean", palette = pal , title = "Weekend Morning Trips") +
tm_borders(alpha = 0.4)Weekend Evening
lisa_WEE <- wm_idw %>%
mutate(local_moran = local_moran(
WeekendEveningTrips, nb, wts, nsim = 99, na.action=na.pass),
.before = 1) %>%
unnest(local_moran)ii_val_moran_WEE <- tm_shape(lisa_WEE) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of Trip",
main.title.size = 0.8)p_val_moran_WEE <- tm_shape(lisa_WEE) +
tm_fill("p_ii_sim") +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)tmap_arrange(ii_val_moran_WEE, p_val_moran_WEE, ncol = 2)
lisa_sig_WEE <- lisa_WEE %>%
filter(p_ii_sim < 0.05)
pal <- RColorBrewer::brewer.pal("Set1", n = 5)
weekend_evening_localmoran <- tm_shape(lisa_WEE) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sig_WEE) +
tm_fill("mean", palette = pal , title = "Weekend Evening Trips") +
tm_borders(alpha = 0.4)LISA Map
Summary all the 4 category, below is the 4 LISA Map in 4 category (weekday_morning, weekday_afternoon, weekend_morning, weekend_evening) using tmap_arrange.
tmap_arrange(weekday_morning_localmoran, weekday_afternoon_localmoran,
weekend_morning_localmoran, weekend_evening_localmoran,
ncol = 2, nrow = 2)
Conclusion
Using the inverse distance weight, the map above displays the local Moran relationship between each hexagonal grid. The 2 top row is weekday morning and weekday afternoon. The 2 in bottom row is weekend morning and weekend afternoon.
From the graph above compare the different between weekday and weekend. On weekday most of the clustering is Low-High follow by High-Low, there is only on Low-Low on weekday morning. For Weekend, there are more Low-Low and High-High compare to weekday. There is more mix cluster on weekend.
The fact that individuals left for work or school in the morning and returned home in the afternoon lends credence to both of the weekday graphs. As it can observed from the map, on weekday mornings the spread occurs more in the suburbs of Singapore, however in the afternoon it is closer to the city core.The weekend graphs show that throughout the morning, the most of the dispersion and clustering occurred in the central region, and during the night, it was slightly dispersed.